Depression Detection Using Automatic Transcriptions of De-Identified Speech
Abstract
Depression is a mood disorder that is usually addressed by outpatient treatments in order to favour patients’ inclusion in society. This creates a need for novel automatic tools that exploit speech processing to help monitor the emotional state of patients via telephone or the Internet. However, the transmission, processing and subsequent storage of such sensitive data raise several privacy concerns. Speech de-identification can be used to protect the patients’ identity. Nevertheless, these techniques modify the speech signal, potentially affecting the performance of depression detection approaches based on either speech characteristics or automatic transcriptions. This paper presents a study on the influence of speech de-identification when using transcription-based approaches for depression detection. To this end, a system based on the Global Vectors (GloVe) method for natural language processing is proposed. In contrast to previous works, two main sources of nuisance are considered: the de-identification process itself and the transcription errors introduced by the automatic recognition of the patients’ speech. Experimental validation on the DAIC-WOZ corpus shows promising results, with only a slight performance degradation with respect to the use of manual transcriptions.
Similar resources
Manual and Automatic Transcriptions in Dementia Detection from Speech
As the population in developed countries is aging, larger numbers of people are at risk of developing dementia. In the near future there will be a need for time- and cost-efficient screening methods. Speech can be recorded and analyzed in this manner, and as speech and language are affected early on in the course of dementia, automatic speech processing can provide valuable support for such scree...
Comparison of forced-alignment speech recognition and humans for generating reference VAD
This paper aims to answer the question of whether forced-alignment speech recognition can be used as an alternative to humans in generating reference Voice Activity Detection (VAD) transcriptions. An investigation of the level of agreement between automatic/manual VAD transcriptions and the reference ones produced by a human expert was carried out. Thereafter, statistical analysis was empl...
Speech recognition based confidence measures for building voices from untranscribed speech
Today, a large amount of audio data is available on the web in the form of audiobooks, podcasts, video lectures, video blogs, and news bulletins. In addition, we can effortlessly record and store audio data such as read/lecture/impromptu speech on hand-held devices. These data are rich in prosody, provide a plethora of voices to choose from, and their availability can significantly reduce the overhea...
Depression Severity Estimation from Multiple Modalities
Depression is a major debilitating disorder which can affect people of all ages. With a continuous increase in the number of annual cases of depression, there is a need to develop automatic techniques for the detection of the presence and extent of depression. In this AVEC challenge we explore different modalities (speech, language and visual features extracted from face) to design and develo...
On the sufficiency of automatic phonetic transcriptions for pronunciation variation research
We investigated whether automatic phonetic transcriptions (APTs) can replace manually verified phonetic transcriptions (MPTs) in a large corpus-based study on pronunciation variation. To this end, we compared the performance of both transcription types in a classification experiment aimed at establishing the direct influence of a particular situational setting on pronunciation variation. We...